Serverless computing
Advanced Architectures Integrated with Agentic AI for Next-Generation Wireless Networks
Dev, Kapal, Khowaja, Sunder Ali, Zeydan, Engin, Debbah, Merouane
This paper investigates a range of cutting-edge technologies and architectural innovations aimed at simplifying network operations, reducing operational expenditure (OpEx), and enabling the deployment of new service models. The focus is on (i) proposing novel, more efficient 6G architectures, with both control and user planes enabling the seamless expansion of services, while addressing long-term 6G network evolution; (ii) exploring advanced techniques for constrained artificial intelligence (AI) operations, particularly the design of AI agents for real-time learning, energy optimization, and the allocation of computational resources; (iii) identifying technologies and architectures that support the orchestration of backend services using serverless computing models across multiple domains, particularly for vertical industries; and (iv) introducing optically based, ultra-high-speed, low-latency network architectures with fast optical switching and real-time control, replacing conventional electronic switching to reduce power consumption by an order of magnitude.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Research Report (1.00)
- Overview > Innovation (0.48)
- Telecommunications (0.94)
- Energy (0.68)
- Information Technology > Security & Privacy (0.47)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Architecture > Real Time Systems (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Scalable and Cost-Efficient ML Inference: Parallel Batch Processing with Serverless Functions
As data-intensive applications grow, batch processing in limited-resource environments faces scalability and resource management challenges. Serverless computing offers a flexible alternative, enabling dynamic resource allocation and automatic scaling. This paper explores how serverless architectures can make large-scale ML inference tasks faster and more cost-effective by decomposing monolithic processes into parallel functions. Through a case study on sentiment analysis using the DistilBERT model and the IMDb dataset, we demonstrate that serverless parallel processing can reduce execution time by over 95% compared to monolithic approaches, at the same cost.
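The decomposition described above can be sketched as a fan-out/fan-in pattern: split the input into batches, invoke one function per batch, and reassemble results in order. This is a minimal illustration, not the paper's implementation; `invoke_function` is a hypothetical stand-in for an HTTP call to a deployed serverless endpoint running the model.

```python
from concurrent.futures import ThreadPoolExecutor

def invoke_function(batch):
    """Stand-in for one serverless invocation: in a real deployment this
    would POST the batch to a function endpoint running DistilBERT."""
    return [1 if "good" in text else 0 for text in batch]  # toy sentiment rule

def parallel_inference(texts, batch_size=2, max_workers=8):
    # Decompose the monolithic job into independent batches
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    # Fan out: each batch maps to one (simulated) function invocation
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(invoke_function, batches)
    # Fan in: reassemble partial results in their original order
    return [label for part in results for label in part]
```

Because the batches are independent, wall-clock time shrinks roughly with the number of concurrent invocations, which is where the reported speedup comes from.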
Enabling Efficient Serverless Inference Serving for LLM (Large Language Model) in the Cloud
This review report discusses cold-start latency in serverless inference and existing solutions. It particularly reviews the ServerlessLLM method, a system designed to address the cold-start problem in serverless inference for large language models (LLMs). Traditional serverless approaches struggle with high latency due to the size of LLM checkpoints and the associated GPU resource activation. These models, due to their size--often reaching hundreds of gigabytes--and computational requirements, encounter delays due to what is known as the cold-start problem [22]. This latency arises when serverless functions, previously idle, initiate, leading to delays from the loading of extensive LLM checkpoints and GPU resource activation. Such cold starts can significantly hinder performance in applications requiring real-time interaction, making solutions to this problem imperative for scalable, serverless LLM deployment.
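The cold-start cost described here is paid only when a fresh execution environment must load the checkpoint; warm invocations reuse state cached at module scope. The sketch below shows that baseline warm-reuse pattern (it is not ServerlessLLM's own technique, which additionally optimizes checkpoint loading itself); `load_checkpoint` is a hypothetical stand-in with a simulated delay.

```python
import time

_MODEL = None  # module-level cache: survives across warm invocations

def load_checkpoint():
    """Stand-in for loading a multi-gigabyte LLM checkpoint onto a GPU."""
    time.sleep(0.1)  # simulated load cost; real cold starts can take far longer
    return {"weights": "..."}

def handler(event):
    """Function entry point: pays the load cost only on a cold start."""
    global _MODEL
    cold = _MODEL is None
    if cold:
        _MODEL = load_checkpoint()  # cold start: checkpoint load + GPU setup
    return {"cold_start": cold, "output": f"echo:{event}"}
```

The first call is slow and reports a cold start; subsequent calls in the same environment are fast, which is why keeping environments warm (or loading checkpoints faster, as ServerlessLLM does) matters so much for interactive LLM serving.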
StraightLine: An End-to-End Resource-Aware Scheduler for Machine Learning Application Requests
Ching, Cheng-Wei, Guan, Boyuan, Xu, Hailu, Hu, Liting
The life cycle of machine learning (ML) applications consists of two stages: model development and model deployment. However, traditional ML systems (e.g., training-specific or inference-specific systems) focus on one particular stage of this life cycle. These systems often aim at optimizing model training or accelerating model inference, and they frequently assume homogeneous infrastructure, which may not always reflect real-world scenarios that include cloud data centers, local servers, containers, and serverless platforms. This paper presents StraightLine, whose key innovation is an empirical dynamic placing algorithm that intelligently places requests based on their unique characteristics (e.g., request frequency, input data size, and data distribution). In contrast to existing ML systems, StraightLine offers end-to-end resource-aware placement, and can therefore significantly reduce response time and failure rate for model deployment when facing different computing resources in the hybrid infrastructure.
- North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
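Resource-aware placement of the kind StraightLine performs can be caricatured as a decision rule over request characteristics. The thresholds and target names below are purely illustrative assumptions, not values from the paper:

```python
def place_request(freq_per_min, input_mb):
    """Toy placement rule in the spirit of a dynamic placing algorithm:
    route a request to the cheapest target that can serve it well.
    All thresholds are illustrative, not StraightLine's."""
    if freq_per_min < 1 and input_mb < 10:
        return "serverless"    # rare, small requests: pay-per-use wins
    if input_mb >= 100:
        return "local-server"  # large payloads: avoid cold-start and transfer cost
    return "container"         # steady traffic: keep a warm container
```

A real scheduler would learn such boundaries empirically from observed response times and failure rates rather than hard-coding them.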
Detection of Compromised Functions in a Serverless Cloud Environment
Lavi, Danielle, Brodt, Oleg, Mimran, Dudu, Elovici, Yuval, Shabtai, Asaf
Serverless computing is an emerging cloud paradigm with serverless functions at its core. While serverless environments enable software developers to focus on developing applications without the need to actively manage the underlying runtime infrastructure, they open the door to a wide variety of security threats that can be challenging to mitigate with existing methods. Existing security solutions do not apply to all serverless architectures, since they require significant modifications to the serverless infrastructure or rely on third-party services for the collection of more detailed data. In this paper, we present an extendable serverless security threat detection model that leverages cloud providers' native monitoring tools to detect anomalous behavior in serverless applications. Our model aims to detect compromised serverless functions by identifying post-exploitation abnormal behavior related to different types of attacks on serverless functions, making it a last line of defense. Our approach is not tied to any specific serverless application, is agnostic to the type of threats, and is adaptable through model adjustments. To evaluate our model's performance, we developed a serverless cybersecurity testbed in an AWS cloud environment, which includes two different serverless applications and simulates a variety of attack scenarios that cover the main security threats faced by serverless functions. Our evaluation demonstrates our model's ability to detect all implemented attacks while maintaining a negligible false alarm rate.
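Detecting post-exploitation behavior from native monitoring data amounts to flagging invocations whose metrics deviate from the learned baseline. The snippet below is a deliberately minimal stand-in for the paper's model, using a simple z-score over per-invocation durations of the kind a provider's monitoring tools expose:

```python
from statistics import mean, stdev

def detect_anomalies(durations_ms, threshold=3.0):
    """Flag invocations whose duration deviates strongly from the baseline.
    A minimal illustrative detector, not the paper's actual model."""
    if len(durations_ms) < 2:
        return []
    mu, sigma = mean(durations_ms), stdev(durations_ms)
    if sigma == 0:
        return []  # perfectly uniform behavior: nothing to flag
    return [i for i, d in enumerate(durations_ms)
            if abs(d - mu) / sigma > threshold]
```

A compromised function that suddenly runs crypto-mining or exfiltrates data tends to shift exactly such coarse metrics, which is why even provider-native telemetry can serve as a last line of defense.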
AIhub monthly digest: March 2024 – human-robot interaction, serverless computing, and deep reinforcement learning for communication networks
Welcome to our monthly digest, where you can catch up with any AIhub stories you may have missed, peruse the latest news, recap recent events, and more. This month, we find out about explainability and human-robot interaction, serverless computing for machine learning, and deep reinforcement learning for communication networks. We also chat to AAAI President Francesca Rossi, and congratulate the ACM/SIGAI Autonomous Agents Research Award winner Catholijn Jonker. "AI used to be a scientific and technical field, now it has become a socio-technical discipline." AIhub ambassador Andrea Rafai caught up with AAAI President Francesca Rossi to ask about her research, regulation of AI, and the UN sustainable development goals: Interview with Francesca Rossi – talking sustainable development goals, AI regulation, and AI ethics.
Interview with Amine Barrak: serverless computing and machine learning
The AAAI/SIGAI Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. This year, 30 students were selected for this programme, and we've been hearing from them about their research. In this interview, Amine Barrak tells us about his work speeding up machine learning by using serverless computing. My focus is on speeding up machine learning by using serverless computing. My research is about finding a way to do machine learning training efficiently in small serverless settings.
- Africa > Middle East > Tunisia (0.17)
- North America > Canada > Ontario > Toronto (0.15)
- Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.64)
- Energy > Oil & Gas > Downstream (0.64)
A Review of Deep Reinforcement Learning in Serverless Computing: Function Scheduling and Resource Auto-Scaling
Majid, Amjad Yousef, Marin, Eduard
In the rapidly evolving field of serverless computing, efficient function scheduling and resource scaling are critical for optimizing performance and cost. This paper presents a comprehensive review of the application of Deep Reinforcement Learning (DRL) techniques in these areas. We begin by providing an overview of serverless computing, highlighting its benefits and challenges, with a particular focus on function scheduling and resource scaling. We then delve into the principles of DRL and its potential for addressing these challenges. A systematic review of recent studies applying DRL to serverless computing is presented, covering the algorithms, models, and reported performance. Our analysis reveals that DRL, with its ability to learn and adapt from an environment, shows promising results in improving the efficiency of function scheduling and resource scaling in serverless computing. However, several challenges remain, including the need for more realistic simulation environments, handling of cold starts, and the trade-off between learning time and scheduling performance. We conclude by discussing potential future directions for this research area, emphasizing the need for more robust DRL models, better benchmarking methods, and the exploration of multi-agent reinforcement learning for more complex serverless architectures. This review serves as a valuable resource for researchers and practitioners aiming to understand and advance the application of DRL in serverless computing.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Spain > Galicia > Madrid (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Overview (1.00)
- Research Report > New Finding (0.47)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Services (0.69)
- Leisure & Entertainment > Games (0.66)
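The core loop behind DRL-based auto-scaling can be illustrated with a tiny tabular Q-learning agent: the state is the observed load and current replica count, the action scales replicas up, down, or not at all, and the reward penalizes both under-provisioning (SLO risk) and over-provisioning (idle cost). Every number and dynamic below is an invented toy, assumed purely for illustration; real systems use deep function approximators over far richer state.

```python
import random

def train_autoscaler(episodes=3000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning sketch of DRL-based auto-scaling.
    States: (load level 0-3, replicas 1-4). Actions: scale -1/0/+1.
    Reward: -|ideal_replicas - replicas|, with ideal = load + 1."""
    rng = random.Random(seed)
    loads, actions = range(4), (-1, 0, +1)
    q = {(s, r, a): 0.0 for s in loads for r in range(1, 5) for a in actions}

    load, replicas = rng.choice(list(loads)), 1
    for _ in range(episodes):
        valid = [a for a in actions if 1 <= replicas + a <= 4]
        if rng.random() < eps:                      # explore
            a = rng.choice(valid)
        else:                                       # exploit current estimate
            a = max(valid, key=lambda b: q[(load, replicas, b)])
        new_replicas = replicas + a
        r = -abs(load + 1 - new_replicas)           # penalize over/under-provisioning
        next_load = rng.choice(list(loads))         # toy workload: i.i.d. load levels
        nxt = max(q[(next_load, new_replicas, b)] for b in actions
                  if 1 <= new_replicas + b <= 4)
        q[(load, replicas, a)] += alpha * (r + gamma * nxt - q[(load, replicas, a)])
        load, replicas = next_load, new_replicas
    return q
```

After training, the greedy policy read off the Q-table scales down when the cluster is over-provisioned for the observed load, which is exactly the behavior the reviewed DRL schedulers learn at much larger scale.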
Exploring the Impact of Serverless Computing on Peer To Peer Training Machine Learning
Barrak, Amine, Trabelsi, Ranim, Jaafar, Fehmi, Petrillo, Fabio
The increasing demand for computational power in big data and machine learning has driven the development of distributed training methodologies. Among these, peer-to-peer (P2P) networks provide advantages such as enhanced scalability and fault tolerance. However, they also encounter challenges related to resource consumption, costs, and communication overhead as the number of participating peers grows. In this paper, we introduce a novel architecture that combines serverless computing with P2P networks for distributed training and present a method for efficient parallel gradient computation under resource constraints. Our findings show a significant enhancement in gradient computation time, with up to a 97.34% improvement compared to conventional P2P distributed training methods. As for costs, our examination confirmed that the serverless architecture could incur higher expenses, reaching up to 5.4 times more than instance-based architectures. It is essential to consider that these higher costs are associated with marked improvements in computation time, particularly under resource-constrained scenarios. Despite the cost-time trade-off, the serverless approach still holds promise due to its pay-as-you-go model. Utilizing dynamic resource allocation, it enables faster training times and optimized resource utilization, making it a promising candidate for a wide range of machine learning applications.
- South America > Uruguay > Artigas > Artigas (0.05)
- North America > United States > Washington > King County > Renton (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
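Parallel gradient computation of the kind this paper proposes can be sketched as: shard the data across workers, have each worker compute the gradient on its shard, and average the results. The example below uses a thread pool as a stand-in for concurrent serverless invocations and a one-parameter least-squares model; it illustrates the pattern, not the paper's system.

```python
from concurrent.futures import ThreadPoolExecutor

def shard_gradient(shard, w):
    """Gradient of mean-squared error for y ~ w*x on one data shard;
    stands in for the work a single serverless function would do."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def parallel_gradient(data, w, n_workers=4):
    # Partition the dataset across hypothetical serverless workers
    shards = [s for s in (data[i::n_workers] for i in range(n_workers)) if s]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        grads = list(pool.map(lambda s: shard_gradient(s, w), shards))
    # With equal-size shards, the average of shard gradients equals
    # the gradient over the full dataset
    return sum(grads) / len(grads)
```

Because shards are processed concurrently, gradient computation time drops with the number of workers, at the price of the per-invocation costs the paper quantifies.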
SPIRT: A Fault-Tolerant and Reliable Peer-to-Peer Serverless ML Training Architecture
Barrak, Amine, Jaziri, Mayssa, Trabelsi, Ranim, Jaafar, Fehmi, Petrillo, Fabio
The advent of serverless computing has ushered in notable advancements in distributed machine learning, particularly within parameter server-based architectures. Yet, the integration of serverless features within peer-to-peer (P2P) distributed networks remains largely uncharted. In this paper, we introduce SPIRT, a fault-tolerant, reliable, and secure serverless P2P ML training architecture designed to bridge this gap. Capitalizing on the inherent robustness and reliability innate to P2P systems, SPIRT employs RedisAI for in-database operations, leading to an 82% reduction in the time required for model updates and gradient averaging across a variety of models and batch sizes. This architecture showcases resilience against peer failures and adeptly manages the integration of new peers, thereby highlighting its fault-tolerant characteristics and scalability. Furthermore, SPIRT ensures secure communication between peers, enhancing the reliability of distributed machine learning tasks. Even in the face of Byzantine attacks, the system's robust aggregation algorithms maintain high levels of accuracy. These findings illuminate the promising potential of serverless architectures in P2P distributed machine learning, offering a significant stride towards the development of more efficient, scalable, and resilient applications.
- South America > Uruguay > Artigas > Artigas (0.05)
- North America > Canada > Quebec > Saguenay-Lac-Saint-Jean Region > Saguenay (0.04)
- North America > Canada > Quebec > Montreal (0.04)
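The Byzantine resilience mentioned in the SPIRT abstract rests on robust aggregation: combining peers' gradient vectors so that a minority of malicious contributions cannot skew the result. A standard such rule is the coordinate-wise median, shown below as an illustrative stand-in (the abstract does not specify which robust aggregation SPIRT uses):

```python
from statistics import median

def robust_aggregate(peer_gradients):
    """Coordinate-wise median of peers' gradient vectors: a classic
    Byzantine-robust alternative to plain averaging. As long as honest
    peers form a majority, an outlier gradient cannot move the result
    arbitrarily far."""
    return [median(coord) for coord in zip(*peer_gradients)]
```

With four honest peers and one attacker submitting a huge gradient, plain averaging would be dominated by the attacker, while the median simply ignores it.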